SEAR: Schema-Based Evaluation and Routing for LLM Gateways

Zecheng Zhang (Strukto.AI), Han Zheng (Infron.AI), Yue Xu (Infron.AI)

Engineering & Operations · Architectural Patterns & Composition

SEAR is a production evaluation and routing system for multi-model LLM gateways. It exposes about 100 typed, SQL-queryable quality signals (covering intent, response characteristics, issue attribution, and scores) to drive fine-grained routing decisions across providers. This makes LLM gateway behavior observable and steerable in a way that provider-level metrics alone cannot.

Presentation

Talk

Paper Session 8: AI Systems in Practice

Friday, May 29 · 12:40 PM – 12:50 PM

Bayshore Ballroom

Poster

Friday, May 29 · 1:45 PM – 3:15 PM

Carmel / Monterey

Abstract

Evaluating production LLM responses and routing requests across providers in LLM gateways requires fine-grained quality signals and operationally grounded decisions. To meet this need, we present SEAR, a schema-based evaluation and routing system for multi-model, multi-provider LLM gateways. SEAR defines an extensible relational schema covering both LLM evaluation signals (context, intent, response characteristics, issue attribution, and quality scores) and gateway operational metrics (latency, cost, throughput), with cross-table consistency links spanning roughly one hundred typed, SQL-queryable columns. To populate the evaluation signals reliably, SEAR proposes self-contained signal instructions, in-schema reasoning, and multi-stage generation that produces database-ready structured outputs. Because signals are derived through LLM reasoning rather than shallow classifiers, SEAR captures complex request semantics, enables human-interpretable routing explanations, and unifies evaluation and routing in a single query layer. Across thousands of production sessions, SEAR achieves strong signal accuracy on human-labeled data and supports practical routing decisions, including large cost reductions with comparable quality.
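To make the idea of a single query layer concrete, the following is a minimal sketch, not SEAR's actual schema. It assumes two hypothetical tables, `eval_signals` and `gateway_metrics`, linked by a shared `session_id`, and a hypothetical helper `cheapest_model` that picks the lowest-cost model whose average quality for a given intent clears a threshold. All table names, column names, model names, and values are invented for illustration.

```python
import sqlite3
from typing import Optional


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a toy two-table schema (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Evaluation signals, one row per session (hypothetical columns).
        CREATE TABLE eval_signals (
            session_id   TEXT PRIMARY KEY,
            intent       TEXT NOT NULL,
            quality      REAL NOT NULL CHECK (quality BETWEEN 0 AND 1),
            issue_source TEXT
        );
        -- Operational metrics, linked to the signals by session_id.
        CREATE TABLE gateway_metrics (
            session_id TEXT PRIMARY KEY REFERENCES eval_signals(session_id),
            model      TEXT NOT NULL,
            latency_ms REAL NOT NULL,
            cost_usd   REAL NOT NULL
        );
        """
    )
    # Invented sample rows; not data from the paper.
    signals = [
        ("s1", "code_generation", 0.92, None),
        ("s2", "code_generation", 0.90, None),
        ("s3", "code_generation", 0.91, None),
        ("s4", "code_generation", 0.60, "model"),
        ("s5", "summarization", 0.88, None),
    ]
    metrics = [
        ("s1", "model-large", 900.0, 0.030),
        ("s2", "model-small", 400.0, 0.004),
        ("s3", "model-small", 420.0, 0.004),
        ("s4", "model-tiny", 200.0, 0.001),
        ("s5", "model-small", 300.0, 0.002),
    ]
    conn.executemany("INSERT INTO eval_signals VALUES (?, ?, ?, ?)", signals)
    conn.executemany("INSERT INTO gateway_metrics VALUES (?, ?, ?, ?)", metrics)
    return conn


def cheapest_model(
    conn: sqlite3.Connection, intent: str, min_quality: float
) -> Optional[str]:
    """Return the lowest-average-cost model whose average quality for
    `intent` is at least `min_quality`, or None if no model qualifies."""
    row = conn.execute(
        """
        SELECT m.model
        FROM eval_signals AS e
        JOIN gateway_metrics AS m USING (session_id)
        WHERE e.intent = ?
        GROUP BY m.model
        HAVING AVG(e.quality) >= ?
        ORDER BY AVG(m.cost_usd) ASC
        LIMIT 1
        """,
        (intent, min_quality),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    db = build_demo_db()
    print(cheapest_model(db, "code_generation", 0.85))
```

In this toy setup, evaluation and routing share one join: the same query that reports per-model quality for an intent also selects the routing target, and raising `min_quality` shifts the choice toward the more expensive model.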
